Hash Joins and Hash Teams in Microsoft SQL Server
نویسندگان
چکیده
The query execution engine in Microsoft SQL Server employs hash-based algorithms for inner and outer joins, semi-joins, set operations (such as intersection), grouping, and duplicate removal. The implementation combines many techniques proposed individually in the research literature but never combined in a single implementation, neither in a product nor in a research prototype. One of the paper’s contributions is a design that cleanly integrates most existing techniques. One technique, however, which we call hash teams and which has previously been described only in vague terms, has not been implemented in prior research or product work. It realizes in hash-based query processing many of the benefits of interesting orderings in sort-based query processing. Moreover, we describe how memory is managed in complex and bushy query evaluation plans with multiple sort and hash operations. Finally, we report on the effectiveness of hashing using two very typical database queries, including the performance effects of hash teams.
منابع مشابه
The Value of Merge-Join and Hash-Join in SQL Server
Microsoft SQL Server was successful for many years for transaction processing and decision support workloads with neither merge join nor hash join, relying entirely on nested loops and index nested loops join. How much difference do additional join algorithms really make, and how much system performance do they actually add? In a pure OLTP workload that requires only record-to-record navigation...
متن کاملAn Adaptive Hash Join Algorithm for Multiuser Environments
As main memory becomes a cheaper resource, hash joins are an alternative to the traditional methods of performing equi-joins: nested loop and merge joins. This paper introduces a modified, adaptive hash join method that is designed to work with dynamic changes in the amount of available memory. The general idea of the algorithm is to regulate resource usage of a hash join in a way that allows i...
متن کاملOn a Three-Way Hash Join Algorithm
We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Comparing to two-cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...
متن کاملAdapting Hash Joins For Modern Processors
Hash join algorithms are crucial to the performance of modern database systems. Conventional hash joins exhibit poor memory system performance on modern processors because their key data structure, the bucket-chain hash table, is ill-suited for the performance characteristics of out-of-order processors with large cache hierarchies. Whereas prior research has considered a variety of optimization...
متن کاملMemory-Efficient Hash Joins
We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1998